Speech activity detection fusing acoustic phonetic and energy features

نویسندگان

  • Etienne Marcheret
  • Karthik Visweswariah
  • Gerasimos Potamianos
چکیده

With the wider deployment of automatic speech recognition (ASR) systems, the importance of robust speech activity detection has been elevated both as a means of reducing bandwidth in client/server ASR and for overall system stability from barge-in through the recognition process. In this paper we investigate a novel technique for speech activity detection, that we have found to be effective in handling non-stationary noise events without negatively impacting the recognition process. This technique is based on combining acoustic phonetic likelihood based features with energy features extracted from the signal waveform. Reported results on two speech activity detection tasks demonstrate that the proposed method outperforms techniques which rely solely on acoustic or energy features.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Study of the Relationship between Acoustic Features of “bæle” and the Paralinguistic Information

Language users benefit from special phonetic tools in order to communicate linguistic information as well as different emotional aspects and paralinguistic information through daily conversation. Having functions in conveying semantic information to listeners, prosodic features form the essential part of linguistic behavour, manipulating  them potentially can play an important role in transmitt...

متن کامل

Title of dissertation : SPEECH RECOGNITION BASED ON PHONETIC FEATURES AND ACOUSTIC LANDMARKS

Title of dissertation: SPEECH RECOGNITION BASED ON PHONETIC FEATURES AND ACOUSTIC LANDMARKS Amit Juneja, Doctor of Philosophy, 2004 Dissertation directed by: Carol Espy-Wilson Department of Electrical and Computer Engineering A probabilistic and statistical framework is presented for automatic speech recognition based on a phonetic feature representation of speech sounds. In this acoustic-phone...

متن کامل

A probabilistic framework for landmark detection based on phonetic features for automatic speech recognition.

A probabilistic framework for a landmark-based approach to speech recognition is presented for obtaining multiple landmark sequences in continuous speech. The landmark detection module uses as input acoustic parameters (APs) that capture the acoustic correlates of some of the manner-based phonetic features. The landmarks include stop bursts, vowel onsets, syllabic peaks and dips, fricative onse...

متن کامل

Phonetic unit localization in a multi-expert recognition system

This paper describes an acoustic-to-phonetic decoder (APD) (based on a mixed strategy: a) bottom-up which hypothesizes the most robust information about the speech signal, b) top-down which makes some verifications about the acoustic features or about the macro-class localization on the speech signal. In this paper, only the bottom-up strategy is described. In our system, a phoneme is described...

متن کامل

Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine

In this paper, we argue the way of modeling speech signals based on three-way restricted Boltzmann machine (3WRBM) for separating phonetic-related information and speaker-related information from an observed signal automatically. The proposed model is an energy-based probabilistic model that includes three-way potentials of three variables: acoustic features, latent phonetic features, and speak...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005